
Instacart Market Basket Analysis 🛍

Abdu Samaraie Final project Preview

DSA501 Summer 2022

References:

Runestone Academy HTTLADS

Kaggle.com

Intro 🛒

Using the power of Exploratory Data Analysis to squeeze every ounce of insight from the data and drive business growth.

Online orders make it easy to collect customer data that we did not have access to before. For example, who would have known what a customer's first instinct is when buying from the store? With online orders, we now have this information at hand.


Instacart is a grocery ordering and delivery app offering over 500 million products from 40,000 stores across the U.S. and Canada. It recommends products to users based on their previous orders.

Back in 2017, the company announced its first public dataset release, which is anonymized and contains a sample of over 3 million grocery orders from more than 200,000 Instacart users.


The goal of this project is to find out which products customers tend to buy first when they start a new online order, and whether their orders consist mostly of vegan, vegetarian, or health foods. Answering these questions will give us valuable insight into customer buying habits.

READING THE DATA

Step 1: Download the data

The data for the final project can be obtained from Kaggle.
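Once the CSV files are downloaded, they can be read into pandas. The sketch below is one possible way to do it, not necessarily how this notebook loaded the data. The helper name `load_tables` and the folder passed to it are assumptions; the file names follow the dataset names used in this project.

```python
from pathlib import Path

import pandas as pd

# File stems as used in this project (note the double underscore in two names).
TABLE_NAMES = [
    "aisles",
    "departments",
    "order_products__prior",
    "order_products__train",
    "orders",
    "products",
]


def load_tables(data_dir):
    """Read every Instacart CSV found in data_dir into a dict of DataFrames.

    `load_tables` is a hypothetical helper; data_dir is wherever the
    Kaggle download was unpacked.
    """
    data_dir = Path(data_dir)
    tables = {}
    for name in TABLE_NAMES:
        path = data_dir / f"{name}.csv"
        if path.exists():  # skip any table that was not downloaded
            tables[name] = pd.read_csv(path)
    return tables
```

Calling `load_tables("data")` (with `data` standing in for the real download folder) returns a dictionary keyed by table name, for example `tables["orders"]`.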

EXPLORING THE INSTACART DATASET

Step 2: Explore the datasets and features

Explain what each data set represents

Descriptions

aisles = aisle number with a description of each aisle in the store

departments = name of each department in the store (a description of the items in the department)

order_products__prior = the products in each customer's previous orders

order_products__train = the products in each customer's most recent order

orders = one row per order, recording which customer placed it and when

products = all the products available in the store

aisles (features):

departments (features):

order_products__prior (features):

order_products__train (features):

orders (features):

products (features):
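The feature lists for the six tables above can be produced in one pass. The following is a minimal sketch, assuming the tables are held in a dictionary that maps each table name to its DataFrame (a hypothetical structure, as is the helper name `feature_overview`).

```python
import pandas as pd


def feature_overview(tables):
    """Return one row per table: its name, row count, and column names.

    `tables` is assumed to be a dict of {table_name: DataFrame}.
    """
    rows = [
        {"table": name, "rows": len(df), "features": ", ".join(df.columns)}
        for name, df in tables.items()
    ]
    return pd.DataFrame(rows, columns=["table", "rows", "features"])
```

Displaying the returned DataFrame in a notebook cell gives a compact table with one line per dataset.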

QUESTIONING THE DATA

4D Framework

  1. Problem: Which products do customers tend to buy first when they start a new online order? Do their orders consist mostly of vegan, vegetarian, or health foods?
  2. Outcome: Find patterns in the products ordered to understand customers' needs and behaviors, based on one product or several (for example, the top 3 products purchased first). Knowing which items are most frequently purchased is the first step for Instacart to optimize its software product and recommend items to customers while they shop.
  3. Action: Explore and manipulate the data, and mark the frequently purchased products as essential items in order to sell more of them. The type of product purchased gives us insight into the customer and their purchase behavior.
  4. Measure: Number of products sold (increase or decrease).

Q0: How many orders did user_id 1 have, and what are those orders?
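One way to answer Q0 is to filter orders.csv by user. This is a sketch under the assumption that the orders table has `user_id` and `order_number` columns, as in the Kaggle release; the function name `orders_for_user` is made up for illustration.

```python
import pandas as pd


def orders_for_user(orders, user_id):
    """Return the rows of `orders` for one user, oldest order first.

    Assumes `orders` has the columns user_id and order_number.
    """
    user_orders = orders[orders["user_id"] == user_id]
    return user_orders.sort_values("order_number").reset_index(drop=True)
```

`len(orders_for_user(orders, 1))` then gives the number of orders for user_id 1, and the returned rows are the orders themselves.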

Q1: Are order_id values unique in the dataset orders.csv?

Answer: Each order appears only once, so order_id values are unique.

Q2: Are user_id values unique in orders.csv?

Q3: How many orders does each user have?
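Q1 to Q3 can be checked with a few pandas calls. The sketch below assumes an orders DataFrame with `order_id` and `user_id` columns; both helper names are hypothetical.

```python
import pandas as pd


def id_uniqueness(orders):
    """Report whether order_id and user_id each appear only once in `orders`."""
    return {col: bool(orders[col].is_unique) for col in ("order_id", "user_id")}


def orders_per_user(orders):
    """Count distinct orders per user_id, largest count first."""
    counts = orders.groupby("user_id")["order_id"].nunique()
    return counts.sort_values(ascending=False)
```

A user who has placed more than one order appears on several rows of orders.csv, so `user_id` would come back as not unique whenever any user has multiple orders.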

Q4: List the features of each dataset as clearly as possible.

Q5: What did user_id 1 purchase on their last order (i.e. the "train" order)?
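Q5 needs three tables joined together. The sketch below assumes the column names from the Kaggle release (`eval_set` in orders, `add_to_cart_order` and `reordered` in order_products__train, `product_name` in products); the function name is hypothetical.

```python
import pandas as pd


def last_order_products(orders, order_products_train, products, user_id):
    """List the products in a user's "train" order, in add-to-cart sequence."""
    # Keep only this user's order that is flagged as the train order.
    train_orders = orders[(orders["user_id"] == user_id) & (orders["eval_set"] == "train")]
    # Attach the order lines, then the human-readable product names.
    items = order_products_train.merge(train_orders[["order_id"]], on="order_id")
    items = items.merge(products[["product_id", "product_name"]], on="product_id", how="left")
    columns = ["order_id", "add_to_cart_order", "product_name", "reordered"]
    return items.sort_values("add_to_cart_order")[columns].reset_index(drop=True)
```

If the user has no train order, the result is simply an empty DataFrame.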

Q6: What is the most frequently purchased product in prior?

Answer: banana

Question: What is the most popular second item?
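Both questions above come down to counting product_id values, optionally restricted to one position in the cart. This is a sketch, assuming the `add_to_cart_order` column from the Kaggle release; `top_products` and its parameters are hypothetical names.

```python
import pandas as pd


def top_products(order_products, products, n=10, cart_position=None):
    """Return the n most frequently ordered products.

    If cart_position is given (1 = first item added, 2 = second, ...),
    only order lines at that position are counted.
    """
    rows = order_products
    if cart_position is not None:
        rows = rows[rows["add_to_cart_order"] == cart_position]
    counts = (
        rows["product_id"]
        .value_counts()  # sorted from most to least frequent
        .head(n)
        .rename_axis("product_id")
        .reset_index(name="times_ordered")
    )
    # A left merge keeps the frequency ordering of `counts`.
    return counts.merge(products[["product_id", "product_name"]], on="product_id", how="left")
```

`top_products(prior, products, n=1)` addresses Q6, `cart_position=2` gives the most popular second item, and `cart_position=1` gives the product most often added first.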

Q7: Which department is shopped the most?

Q8: What department gets the most sales?
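The dataset has no price column, so "sales" here can only mean items sold. The sketch below reads Q7 as the number of distinct orders that include a department and Q8 as the number of items sold from it; that reading is my assumption, as is the helper name.

```python
import pandas as pd


def department_activity(order_products, products, departments):
    """Summarise each department by items sold and by distinct orders."""
    merged = (
        order_products
        .merge(products[["product_id", "department_id"]], on="product_id")
        .merge(departments[["department_id", "department"]], on="department_id")
    )
    summary = merged.groupby("department").agg(
        items_sold=("product_id", "count"),  # one row per item in an order
        orders=("order_id", "nunique"),      # orders containing the department
    )
    return summary.sort_values("items_sold", ascending=False)
```

Sorting by the `orders` column instead of `items_sold` ranks departments by how many orders they appear in.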

Q9: What is the number of items sold by day of the week?
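For Q9, each order line can be tagged with the day its order was placed and then counted. This sketch assumes the `order_dow` column from the Kaggle release, which holds an integer day code; as far as I know, the mapping from code to weekday name is not spelled out in the dataset, so the codes are left as they are.

```python
import pandas as pd


def items_by_day_of_week(orders, order_products):
    """Count items sold for each order_dow value."""
    merged = order_products.merge(orders[["order_id", "order_dow"]], on="order_id")
    return merged.groupby("order_dow").size().rename("items_sold")
```

The returned Series can be plotted directly, for example with `.plot(kind="bar")`.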

Answering: Which product was purchased first most often across all customers?



What can we learn from the first item a customer adds to the cart?


Knowing which product grabs the customer's attention first is really important. It can help us retarget the customer with a marketing campaign that recommends similar products they are likely to be interested in.

Solution: Products placed first in the cart are the products most often reordered. We can build on this information and start recommending products that are similar to the most reordered ones.

Note: In marketing terms, this is called cross-selling or up-selling.
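The observation that first-in-cart products are the ones most often reordered can be checked by computing the reorder rate at each cart position. This is a sketch, assuming the `add_to_cart_order` and `reordered` (0 or 1) columns from the Kaggle release; the function name and the `max_position` cutoff are hypothetical.

```python
import pandas as pd


def reorder_rate_by_position(order_products, max_position=10):
    """Share of order lines that are reorders, for each cart position."""
    rows = order_products[order_products["add_to_cart_order"] <= max_position]
    # The mean of a 0/1 flag is the fraction of lines flagged as reorders.
    return rows.groupby("add_to_cart_order")["reordered"].mean().rename("reorder_rate")
```

If the rate is highest at position 1 and falls off at later positions, that supports the solution above.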

Purchasing behaviour on Departments and Aisles

Conclusion 💰

This Exploratory Data Analysis gave me a detailed understanding of customer shopping behavior on the Instacart platform. Knowing which items are purchased first is the first step for Instacart to optimize its software product and recommend items to customers while they shop.